Software Defect Prediction: Heuristics for Weighted Naïve Bayes

Authors

  • Burak Turhan
  • Ayse Basar Bener
Abstract

Defect prediction is an important topic in software quality research. Statistical models for defect prediction can be built from project repositories, which store software metrics and defect information that is then matched with software modules. Naïve Bayes is a well-known, simple statistical technique that assumes the ‘independence’ and ‘equal importance’ of features, assumptions that do not hold in many problems. Nevertheless, Naïve Bayes performs well on a wide spectrum of prediction problems. This paper addresses the ‘equal importance’ assumption of Naïve Bayes. We propose that heuristics can be used to assign weights to features according to their importance and thereby improve defect prediction performance. We compare the performance of weighted Naïve Bayes and standard Naïve Bayes predictors on publicly available datasets. Our experimental results indicate that assigning weights to software metrics increases prediction performance significantly.
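
The idea of relaxing the ‘equal importance’ assumption can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not state which heuristics produce the weights, so the weights here are supplied by hand, and the Gaussian likelihood, the function names fit_gaussian_nb and predict_weighted_nb, and the toy metric values are all assumptions made for illustration. Each feature's log-likelihood is multiplied by its weight before summing, so a weight of 1 for every feature reduces to standard Naïve Bayes and a weight of 0 ignores that feature.

```python
import numpy as np


def fit_gaussian_nb(X, y):
    """Estimate class priors and per-class feature means and variances."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # A small constant keeps every variance strictly positive.
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances


def predict_weighted_nb(X, model, weights):
    """Predict labels, scaling each feature's log-likelihood by its weight."""
    classes, priors, means, variances = model
    # Gaussian log-density per sample, class, and feature: shape (n, k, d).
    diff = X[:, None, :] - means[None, :, :]
    log_lik = -0.5 * (np.log(2 * np.pi * variances)[None, :, :]
                      + diff ** 2 / variances[None, :, :])
    # Weighted sum over features, plus the log prior of each class.
    scores = np.log(priors)[None, :] + (log_lik * weights[None, None, :]).sum(axis=2)
    return classes[np.argmax(scores, axis=1)]


# Hypothetical toy data: two software metrics per module, label 1 = defective.
X_train = np.array([[1.0, 5.0], [2.0, 1.0], [1.5, 9.0],
                    [8.0, 2.0], [9.0, 7.0], [8.5, 4.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
model = fit_gaussian_nb(X_train, y_train)

equal_weights = np.ones(2)             # standard Naive Bayes
manual_weights = np.array([1.0, 0.0])  # hand-picked: ignore the second metric

X_new = np.array([[1.2, 100.0], [8.8, -50.0]])
pred_standard_train = predict_weighted_nb(X_train, model, equal_weights)
pred_weighted_new = predict_weighted_nb(X_new, model, manual_weights)
print(pred_standard_train, pred_weighted_new)
```

In practice the weights would come from a feature-importance heuristic computed on the training data rather than being set by hand; the sketch only shows where such weights enter the classifier.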

Similar resources

Comparing the Effectiveness of Machine Learning Algorithms for Defect Prediction

Software repositories with defect logs are the main resource for defect prediction. In recent years, researchers have used the vast amount of data contained in software repositories to predict the location of the defect in the code that caused a problem. In this paper, a machine learning approach is used to predict defective modules for an embedded data set. Public datasets from the promise r...

WBCsvm: Weighted Bayesian Classification based on Support Vector Machines

This paper introduces an algorithm that combines naïve Bayes classification with feature weighting. Most of the related approaches to feature transformation for naïve Bayes suggest various heuristics and non-exhaustive search strategies for selecting a subset of features with which naïve Bayes performs better than with the complete set of features. In contrast, the algorithm introduced in this ...

Practical considerations in deploying statistical methods for defect prediction: A case study within the Turkish telecommunications industry

Context: Building defect prediction models in large organizations has many challenges due to limited resources and tight schedules in the software development lifecycle. It is not easy to collect data, utilize any type of algorithm and build a permanent model at once. We have conducted a study in a large telecommunications company in Turkey to employ a software measurement program and to predic...

Extracting software static defect models using data mining

Keywords: Defect models; Software testing; Software metrics; Defect prediction. Abstract: Large software projects are subject to quality risks of having defective modules that will cause failures during the software execution. Several software repositories contain source code of large projects that are composed of many modules. These software repositories include data for the software metrics of these modu...

Intelligence System for Software Maintenance Severity Prediction

The software industry has been experiencing a software crisis: a difficulty in delivering software within budget, on time, and of good quality. This may happen due to the number of defects present in the different modules of the project that may require maintenance. This makes it necessary to predict the maintenance urgency of a particular module in the software. In this paper, we have applied ...

Publication date: 2007